-
Notifications
You must be signed in to change notification settings - Fork 99
Worker heartbeat: New in-memory metrics mechism, plumb rest of heartbeat data #1023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: worker-heartbeat
Are you sure you want to change the base?
Worker heartbeat: New in-memory metrics mechism, plumb rest of heartbeat data #1023
Conversation
…g, counter_with_in_mem and and_then()
…, remove stale metric previously added
impl WorkerHeartbeatMetrics { | ||
pub fn get_metric(&self, name: &str) -> Option<HeartbeatMetricType> { | ||
match name { | ||
"sticky_cache_size" => Some(HeartbeatMetricType::Individual( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these names are duplicative of some existing metric name in core/src/telemetry/metrics.rs
, which is calling through into this anyway. Rather than having all the fields be pub, I think it'd be safer to make getters for each, individually, that do the same thing this match does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that do the same thing this match does.
Not sure I follow this part, happy to make getters so all fields aren't pub, but it sounds like you're suggesting a change to the match itself? We need some way to map the str name to the struct field. Are you suggesting moving this whole fn over to MetricParameters
?
…ers directly, WithLabel API improvement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like some of the tests are failing too
#[builder(default)] | ||
pub plugins: Vec<PluginInfo>, | ||
|
||
/// Skips the single worker+client+namespace+task_queue check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GH not letting me suggest this line for whatever crazy reason, but:
"Skips the check that enforces only one worker per client may exist with the same namespace/task queue combination"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But, actually I guess it's literally disabling registration entirely. If this is true, even if there are no other conflicting workers, will it end up disabling the worker heartbeating? Definitely shouldn't if it does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes! That's def not what we want. Added a test to verify this behavior
398b5ce
to
33fd78d
Compare
33fd78d
to
f1a3634
Compare
What was changed
NOTE: targeting
worker-heartbeat
feature branch.New in memory mechanism to keep track of certain metrics for worker heartbeating.
Why?
Checklist
Closes
How was this tested:
Note
Introduce in-memory metrics for worker heartbeats, populate/export detailed heartbeat, poller, and slot info, track slot-supplier kind, and add DescribeWorker/SetWorkerDeploymentManager APIs with tests.
HeartbeatMetricType
,WorkerHeartbeatMetrics
) and wire intoCounter
/Gauge
/HistogramDuration
via*_with_in_memory
helpers.MetricsContext::Instruments
to record heartbeat-related counters/gauges/histograms; add NoOp attributes support.WorkerHeartbeat
with SDK identity, times, poller info (incl. last successful poll), slot usage, sticky cache, plugins, and status.status
and exposeworker_instance_key()
on API.WorkerConfig
fields:plugins
,skip_client_worker_set_check
.Fixed
/ResourceBased
/custom) and include in heartbeat.ClientWorkerSet::register
acceptsskip_client_worker_set_check
; heartbeat callback usesArc
.DescribeWorker
RPC and wire through client/C-bridge.SetWorkerDeploymentManager
RPC and related messages.Written by Cursor Bugbot for commit fc7f839. This will update automatically on new commits. Configure here.